CLAIMS



[Claim(s)] 

[Claim 1] Phonation vicarious execution equipment characterized by comprising the
following: a signal-detection means to detect the signal generated by the movement of
the muscles during phonation operation; a syllable discernment means to generate the
recognition signal corresponding to a desired syllable while this phonation operation is
performed; a study means to take as an input pattern the signal detected by this
signal-detection means, and to learn by using as a teacher pattern the recognition signal
corresponding to the syllable generated by this syllable discernment means; a recognition
means to take as an input pattern the signal detected by this signal-detection means,
and to recognize and output the phonation syllable signal corresponding to this input
pattern; a speech synthesis means to synthesize a sound signal from the phonation
syllable signal output by this recognition means; and a voice output means to output as
voice the sound signal produced by this speech synthesis means.
[Claim 2] The phonation vicarious execution equipment according to claim 1, wherein the
aforementioned signal-detection means includes two or more skin surface electrodes for
detecting a signal from the movement of the skin, an amplification means for amplifying
the myoelectric signal detected from these skin surface electrodes, a filter means to
remove the low-frequency component and high-frequency component of the signal amplified
by this amplification means, and a signal transformation means to change into a power
spectrum the signal filtered by this filter means.

[Claim 3] The phonation vicarious execution equipment according to claim 1, wherein the
aforementioned study means and recognition means use a neural network.



DETAILED DESCRIPTION 



[Detailed Description of the Invention]
[0001] 

[Industrial Application] This invention relates to phonation vicarious execution
equipment. In particular, for a non-pharynx person who has undergone ablation of the
pharynx because of some illness and cannot perform phonation, it recognizes the skin
surface myoelectric signal pattern detected by skin surface electrodes fitted to the cheek
and the mandible section when the person attempts to utter an intended syllable, and
utters the recognized syllable artificially with a voice synthesizer. The invention thus
relates to phonation vicarious execution equipment that takes the place of an artificial
pharynx by recognizing the myoelectric signal pattern and synthesizing voice.
[0002]
[Description of the Prior Art] Normal phonation becomes impossible for a non-pharynx
person who has lost the pharynx through some illness. Methods of giving such persons
phonation include applying a vibrator to the outside of the throat to send a sound source
into the pharynx, and inserting a pipe into the mouth to send the sound source in directly.
[0003] Moreover, there is lip reading by computer, in which the movement of a speaker's
lips is captured as images and subjected to recognition processing.
[0004] Furthermore, there is research on processing the myoelectric signal generated in
connection with the movement of the muscles around the mouth in order to discriminate the
kind of vowel. This is shown in the reference: Noboru Sugie et al., "A Speech Prosthesis
Employing a Speech Synthesizer: Vowel Discrimination from Perioral Muscle Activities and
Vowel Production," IEEE Transactions on Biomedical Engineering, Vol. 32, No. 7,
pp. 485-490. In that work, after the myoelectric signal is passed through a band-pass
filter, the number of threshold crossings is counted, and the five vowels (a, i, u, e, o)
are discriminated.
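The threshold-crossing count mentioned for this prior-art approach can be illustrated with
a short sketch. The Python function below is only an illustration under stated
assumptions: the function name, the sample values, and the rule that a crossing is counted
in either direction are hypothetical and are not taken from the cited reference.

```python
def count_threshold_crossings(samples, threshold):
    """Count how often consecutive samples lie on opposite sides of `threshold`.

    Assumption: crossings in both directions (rising and falling) are counted.
    """
    crossings = 0
    for prev, curr in zip(samples, samples[1:]):
        # A crossing occurs when one sample is below the threshold and the next is not.
        if (prev < threshold) != (curr < threshold):
            crossings += 1
    return crossings


if __name__ == "__main__":
    # Hypothetical band-pass-filtered samples, for illustration only.
    print(count_threshold_crossings([0.0, 2.0, 0.0, 2.0, 0.0], threshold=1.0))
```

A count of this kind yields a single number per channel, which is consistent with the
limitation noted later in the text that vowels but not consonants could be discriminated.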
[0005]

[Problem(s) to be Solved by the Invention] However, both the method of sending in a
sound source using the above-mentioned conventional vibrator and the method using a pipe
produce unnatural sound, and since it becomes a sound like a buzzer, other persons cannot
easily recognize the sound as voice.

[0006] Moreover, lip reading by computer has the problems that the recognition rate of
the image processing is low and that it depends on the recognition capacity of the computer.
[0007] Furthermore, although the method of discriminating vowels with a myoelectric
signal can discriminate vowels, it has the problem that consonants cannot be discriminated
at all.

[0008] This invention was made in view of the above points, solves the above-mentioned
conventional problems, and aims at offering phonation vicarious execution equipment that
can output synthesized speech based on the myoelectric potential around the mouth, for
the sake of a person who cannot utter normally for some reason.
[0009] 

[Means for Solving the Problem] Drawing 1 is the principle block diagram of this invention.
[0010] This invention has a signal-detection means 10 to detect the signal generated by
the movement of the muscles during phonation operation; a syllable discernment means 11 to
generate the recognition signal corresponding to a desired syllable while phonation
operation is performed; a study means 12 to take as an input pattern the signal detected
by the signal-detection means 10, and to learn by using as a teacher pattern the
recognition signal corresponding to the syllable generated by the syllable discernment
means 11; a recognition means 13 to take as an input pattern the signal detected by the
signal-detection means 10, and to recognize and output the phonation syllable signal
corresponding to the input pattern; a speech synthesis means 14 to synthesize a sound
signal from the phonation syllable signal output by the recognition means 13; and a voice
output means 15 to output as voice the sound signal produced by the speech synthesis
means 14.
[0011] Moreover, the signal-detection means 10 of this invention includes two or more skin
surface electrodes for detecting a signal from the movement of the skin, an amplification
means for amplifying the myoelectric signal detected from the skin surface electrodes, a
filter means to remove the low-frequency component and high-frequency component of the
signal amplified by the amplification means, and a signal transformation means to change
into a power spectrum the signal filtered by the filter means.

[0012] Moreover, a neural network is used for the study means 12 and the recognition
means 13 of this invention.

[0013] 

[Function] This invention introduces into the recognition section a neural network, a
processor that is capable of learning, robust against noise, and has a high recognition
rate. The signal acquired from phonation operation is input to this neural network as
study data, the syllable correspondence signal corresponding to the phonation operation is
input as teacher data, and study processing is performed. After study, the myoelectric
signals detected from two or more skin surface electrodes during phonation operation of a
user are changed into frequency spectra by FFT (fast Fourier transform), and the changed
spectra are recognized. Since the intended voice syllable is recognized, voice can be
synthesized from the myoelectric signal of the user's phonation operation, and phonation
vicarious execution equipment that outputs this voice can be offered. When a non-pharynx
person performs phonation operation in everyday life, synthesized speech close to natural
voice can be output.
[0014] 

[Example] Hereafter, the example of this invention is explained with a drawing. 

[0015] The outline of this invention is explained first. 

[0016] Drawing 2 is a drawing for explaining the outline of this invention.

[0017] First, as shown in this drawing (A), a user performs utterance operation by moving
the mouth for a desired syllable using the utterance equipment, and the signal produced by
that operation is detected. In order to use this signal as study data, signal
transformation processing is performed, and the result is input into a neural network. In
addition, a recognition signal is generated together with the utterance operation using a
device that can create a signal corresponding to a syllable, such as a keyboard, and this
is used as the teacher data of study processing. The neural network learns from the study
data and teacher data that were input.

[0018] Once study processing has ended, the user who performed study processing performs
utterance operation with the utterance equipment. The signal acquired by this operation
undergoes transform processing so that it can be input into the neural network, and is
input into the neural network for recognition processing. Since the neural network has
already learned, it recognizes the input signal and outputs a recognition result. Voice is
synthesized from the signal that is the recognition result, and is output as voice.

[0019] In this example, although utterance operation is performed using the utterance
equipment, the user does not utter voice. The equipment detects the signal generated by
moving the muscles around the mouth, and performs study processing and recognition
processing based on this signal.

[0020] Drawing 3 is a drawing for explaining the system of one example of this invention.
[0021] As shown in this drawing, utterance equipment 1 is connected with a keyboard 2. The
user holds utterance equipment 1 in one hand and, with the other hand, inputs from the
keyboard 2 the instruction for the voice being uttered. Utterance equipment 1 is a case
made of non-conductive plastic or the like, as shown in this drawing, and has a curved
shape so that it contacts the area near the speaker's mouth and the throat.
[0022] Drawing 4 is a drawing of the utterance equipment of one example of this invention
seen from the rear face.

[0023] This drawing shows the utterance equipment 1 shown in drawing 2 as seen from the
rear face. The skin surface electrodes 31-42 are arranged on the rear face of utterance
equipment 1. The electrodes 31-42 form pairs, such as electrodes 31 and 32 and electrodes
33 and 34, in order to carry out bipolar lead, and a myoelectric signal is acquired by
fitting them to the surface of the speaker's face.
[0024] The portion of the above utterance equipment in which the skin surface electrodes
33, 34, 37, and 38 are installed is made to rest against the throat, and the speaker holds
the equipment in one hand so that the skin surface electrodes 31, 32, 35, 36, 39, 40, 41,
and 42 rest around the mouth. The phonation syllable is then taught from the keyboard
while the syllable is uttered (that is, the mouth is actually moved according to the
syllable). For example, the mouth is moved so that the syllable "**" is uttered, and the
"**" key of the keyboard is pressed simultaneously. Such instruction operation is repeated
for each desired syllable.

[0025] Drawing 5 shows the detailed composition of the utterance equipment of one 
example of this invention. 

[0026] Utterance equipment 1 consists of the skin surface electrodes 31-42 arranged in
two or more pairs, differential amplifiers 81-86, low cut filters 91-96, high frequency cut
filters 101-106, FFTs 111-116, the recognition section 3, the speech synthesis section 4,
amplifier 5, and a loudspeaker 6.

[0027] Utterance equipment 1 detects the skin surface myoelectric signal produced by the
muscles around the user's mouth and throat with the skin surface electrodes 31-42, which
are paired in order to carry out bipolar lead. The signal is amplified by the differential
amplifiers 81-86, its low-frequency component is removed by the low cut filters 91-96, and
its high-frequency component is removed by the high frequency cut filters 101-106,
independently for every channel. Then, in FFTs 111-116, the time-series myoelectric signal
is changed into a power spectrum, the power spectrum is decomposed by octave analysis into
components for each band, and each per-band component is input into the neural network of
the recognition section 3.
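The FFT and octave-analysis stage for one channel can be sketched as follows. This is a
minimal illustration in Python using only the standard library, and several details are
assumptions not stated in the text: the window length is a power of two, the amplification
and the low cut and high frequency cut filtering are taken to have been done beforehand,
and the octave bands are formed by summing power over bin ranges [1,2), [2,4), [4,8) and
so on. The function names are hypothetical.

```python
import cmath
import math


def fft(x):
    """Recursive radix-2 FFT. Assumption: len(x) is a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out


def power_spectrum(window):
    """Power |X[k]|^2 for the non-negative frequency bins 0..N/2."""
    spectrum = fft(window)
    return [abs(c) ** 2 for c in spectrum[: len(window) // 2 + 1]]


def octave_band_powers(power):
    """Sum the power over octave-wide bin ranges [1,2), [2,4), [4,8), ..."""
    bands = []
    lo = 1  # bin 0 (the DC component) is skipped
    while lo < len(power):
        hi = min(lo * 2, len(power))
        bands.append(sum(power[lo:hi]))
        lo *= 2
    return bands


if __name__ == "__main__":
    # Hypothetical test signal: a sine wave falling exactly on bin 4 of a 32-sample window.
    window = [math.sin(2 * math.pi * 4 * n / 32) for n in range(32)]
    print(octave_band_powers(power_spectrum(window)))
```

With six channels, six such band vectors would be concatenated to form the input pattern
of the recognition section; how the channels are combined is not specified in the text.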

[0028] Drawing 6 shows the detailed composition of the recognition section of one
example of this invention, and the speech synthesis section.

[0029] In this drawing, the recognition section 3 consists of a neural network made up of
an input layer 301, an interlayer 302, and an output layer 303. Each O mark represents a
unit, and each line connecting two O marks represents a link including a joint load. The
recognition section 3 recognizes the syllable that the user intended to utter based on the
per-band components input from the channels of FFTs 111-116, the unit of the recognized
syllable ignites, and a syllable signal is sent to the speech synthesis section 4. The
speech synthesis section 4 synthesizes voice from the syllable signal output from the
recognition section 3, and outputs it to amplifier 5.
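The recognition step of such a three-layer network can be sketched as a forward pass
followed by selection of the most strongly activated output unit. The Python code below is
a sketch under assumptions: the sigmoid activation, the bias terms, the rule that the unit
that "ignites" is the one with the largest output, and all weight values and syllable
labels are hypothetical and are not given in the text.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def layer(inputs, weights, biases):
    """Output of one fully connected layer: one weight row and one bias per unit."""
    return [
        sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
        for row, b in zip(weights, biases)
    ]


def recognize(features, w_hidden, b_hidden, w_out, b_out, syllables):
    """Input layer -> interlayer -> output layer, then pick the strongest output unit."""
    hidden = layer(features, w_hidden, b_hidden)
    outputs = layer(hidden, w_out, b_out)
    best = max(range(len(outputs)), key=lambda k: outputs[k])
    return syllables[best], outputs


if __name__ == "__main__":
    # Hand-picked illustrative weights for two input features and two syllables.
    w_hidden = [[5.0, -5.0], [-5.0, 5.0]]
    w_out = [[4.0, -4.0], [-4.0, 4.0]]
    zeros = [0.0, 0.0]
    print(recognize([1.0, 0.0], w_hidden, zeros, w_out, zeros, ["a", "i"])[0])
```

The returned syllable label plays the role of the syllable signal sent to the speech
synthesis section 4.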

[0030] The synthesized sound signal is amplified by amplifier 5, and is output from the
loudspeaker 6.

[0031] Below, the operation in which the speaker uses the utterance equipment and performs
instruction operation (study processing) is explained.

[0032] Drawing 7 is a flow chart which shows instruction operation of one example of this
invention.

[0033] Step 101: First, the joint loads of the units and links of the neural network of
the recognition section 3 are initialized with random numbers.

[0034] Step 102: When the speaker performs utterance operation by moving the mouth and the
muscles of the face move, one of the pairs of bipolar electrodes 31-42 detects this as a
myoelectric signal, and incorporation is started.

[0035] Step 103: At this time, the speaker presses the key of the keyboard 2 for the
syllable intended to be uttered. Within the time window opened on the time-series
myoelectric signal by this key press (drawing 8, a), one of the low cut filters 91-96
removes the low-frequency component and one of the high frequency cut filters 101-106
removes the high-frequency component, and the myoelectric signal from which these
frequency components were removed is input into one of FFTs 111-116. FFTs 111-116 perform
FFT (fast Fourier transform) processing, which changes the myoelectric signal into a power
spectrum, and input it into the recognition section 3.

[0036] Step 104: On the other hand, the intended syllable pressed on the keyboard 2 ("**"
in drawing 8) is given to the output layer of the recognition section 3 as a teacher
signal. An outline of the instruction syllable signal 70, which is the input signal to the
recognition section 3, and of the teacher signal, which is the input signal from the
keyboard 2, is shown in drawing 9. In drawing 8, when the phonation syllable is "**", the
instruction syllable signal 70 shown in drawing 9, which was FFT-processed and converted
into a power spectrum for each band, is input to the input layer 301 of the recognition
section 3. In drawing 9, the instruction signal 70 (a sequence of numeric values,
normalized to 0-1, of the myoelectric signal that was FFT-processed for each band by FFTs
111-116 and converted into a power spectrum) is input into each unit of the input layer
301 of the recognition section 3. Moreover, the teacher signal 60, which is the input
signal from the keyboard 2, is input into each unit of the output layer 303 of the
recognition section 3. In the output layer 303 of the recognition section 3, a signal of
"1" is given to the unit corresponding to the phonation syllable "**", and a signal of "0"
is given to the other units.
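The construction of one training pair, a normalized input pattern and a teacher signal,
can be sketched as follows. The text states only that the values are normalized to 0-1 and
that the unit of the intended syllable receives "1" while the others receive "0"; the
min-max normalization, the function names, and the syllable labels used here are
assumptions made for illustration.

```python
def normalize_01(band_powers):
    """Scale per-band powers into the range 0-1 (min-max scaling is an assumption)."""
    lo, hi = min(band_powers), max(band_powers)
    if hi == lo:
        # A flat spectrum carries no relative information; map it to all zeros.
        return [0.0] * len(band_powers)
    return [(p - lo) / (hi - lo) for p in band_powers]


def teacher_signal(syllable, syllables):
    """Return 1.0 for the output unit of the pressed syllable and 0.0 for all others."""
    return [1.0 if s == syllable else 0.0 for s in syllables]


if __name__ == "__main__":
    # Hypothetical band powers and syllable set, for illustration only.
    print(normalize_01([2.0, 4.0, 6.0]))
    print(teacher_signal("i", ["a", "i", "u"]))
```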

[0037] Study of the neural network of the recognition section 3 is performed, for example,
by the error back-propagation method (reference: Nakano **, "Neuron Computer," Gijutsu
Hyoronsha, 1989). The time window shifts while overlapping, as shown in drawing 8, and
while the key of the keyboard 2 is held down, the instruction syllable signal continues to
be given to the neural network of the recognition section 3.
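Study by error back-propagation on a three-layer network can be sketched as below. This is
a minimal illustration rather than the procedure of the cited reference: the sigmoid
units, the squared-error criterion, the learning rate, the number of repetitions, the
weight range for random initialization, and all names are assumptions.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def init_network(n_in, n_hidden, n_out, rng):
    """Initialize all joint loads with small random numbers (range is an assumption)."""
    def matrix(rows, cols):
        return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]
    # The extra column in each matrix holds the bias weight.
    return {"w1": matrix(n_hidden, n_in + 1), "w2": matrix(n_out, n_hidden + 1)}


def forward(net, x):
    xb = x + [1.0]
    h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in net["w1"]]
    hb = h + [1.0]
    y = [sigmoid(sum(w * v for w, v in zip(row, hb))) for row in net["w2"]]
    return h, y


def train_step(net, x, target, lr=0.5):
    """One back-propagation update; returns the squared error before the update."""
    h, y = forward(net, x)
    xb, hb = x + [1.0], h + [1.0]
    # Error terms of the output units (squared error, sigmoid derivative).
    d_out = [(yk - tk) * yk * (1.0 - yk) for yk, tk in zip(y, target)]
    # Error terms of the interlayer units, propagated back through w2 before it changes.
    d_hid = [
        hj * (1.0 - hj) * sum(d_out[k] * net["w2"][k][j] for k in range(len(d_out)))
        for j, hj in enumerate(h)
    ]
    for k, row in enumerate(net["w2"]):
        for j in range(len(row)):
            row[j] -= lr * d_out[k] * hb[j]
    for j, row in enumerate(net["w1"]):
        for i in range(len(row)):
            row[i] -= lr * d_hid[j] * xb[i]
    return sum((yk - tk) ** 2 for yk, tk in zip(y, target))


def train(net, pairs, epochs=500, lr=0.5):
    """Present every (input pattern, teacher signal) pair repeatedly."""
    for _ in range(epochs):
        for x, t in pairs:
            train_step(net, x, t, lr)


if __name__ == "__main__":
    net = init_network(2, 3, 2, random.Random(0))
    pairs = [([1.0, 0.0], [1.0, 0.0]), ([0.0, 1.0], [0.0, 1.0])]
    train(net, pairs)
    print(forward(net, [1.0, 0.0])[1])
```

In the equipment described here, each overlapping time window taken while a key is held
down would contribute one such pair, so a single key press supplies many training
examples for the same syllable.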

[0038] Step 105: The instruction for study processing of the recognition section 3
repeats the above-mentioned steps until all kinds of syllables are completed.
[0039] After study of the recognition section 3 by the above-mentioned instruction is
completed, the keyboard 2 is removed from utterance equipment 1, and the user performs
phonation operation (recognition processing) of arbitrary syllables using only utterance
equipment 1. That is, the intended phonation syllable is output by speech synthesis from
utterance equipment 1, and communication with other people becomes possible.

[0040] Next, the procedure of signal processing at the time of normal use (recognition
processing) is explained. When instruction is already completed and a speaker performs
arbitrary phonation, utterance equipment 1 recognizes the uttered voice.

[0041] Drawing 10 is the flow chart of recognition processing of phonation operation of
one example of this invention. The recognition processing shown in this drawing assumes
that instruction (study processing) has already been completed.

[0042] Moreover, processing of the myoelectric signal at the time of recognition
processing is shown in drawing 11.

[0043] Step 201: First, the incorporation of a myoelectric signal is started.

[0044] Step 202: The speaker performs phonation operation with utterance equipment 1.

[0045] Step 203: The skin surface electrodes 31-42 detect a myoelectric signal according
to the operation of the muscles of the speaker's face. At time t=t1, the time-series
myoelectric signal shown in drawing 11 is filtered by the low cut filters 91-96 and the
high frequency cut filters 101-106, windowed, given to FFTs 111-116, and then input into
the recognition section 3. The per-band power spectrum of the myoelectric signal at t1 is
given to the recognition section 3 as shown in this drawing, and the recognition section 3
recognizes which syllable the myoelectric signal concerned corresponds to.

[0046] Step 204: By the time t=t2 is reached, the signal recognized by the recognition
section 3 is output as voice "**" from the speech synthesis section 4.
[0047] Similarly, at time t=t2, the time-series myoelectric signal is filtered, windowed,
and FFT-processed, and is given to the same recognition section 3 as an input. When
utterance of "**" is continued after recognition processing, the utterance at t=t2 is also
carried out vicariously as "**" from the speech synthesis section 4.

[0048] Step 205: Thereafter, the same processing is performed for t3, t4, and so on, and
the syllable intended to be uttered is output from the speech synthesis section 4.
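The window-by-window recognition at t1, t2, t3, and so on can be sketched as a loop over
successive windows of the sampled signal. The Python code below is an illustration under
assumptions: the window size, the hop between windows, and the stand-in classifier are
hypothetical. In the equipment described here, the classifier would be the filtering, FFT,
and neural network stages of the recognition path.

```python
def sliding_windows(samples, size, hop):
    """Yield successive windows of `size` samples; a hop smaller than size overlaps them."""
    for start in range(0, len(samples) - size + 1, hop):
        yield samples[start:start + size]


def recognize_stream(samples, size, hop, classify):
    """Classify each window in turn (t1, t2, t3, ...) and collect the recognized syllables."""
    return [classify(window) for window in sliding_windows(samples, size, hop)]


if __name__ == "__main__":
    # Stand-in classifier for illustration only: decides by the sum of the window.
    def toy_classifier(window):
        return "a" if sum(window) < 20 else "i"

    print(recognize_stream(list(range(10)), size=4, hop=2, classify=toy_classifier))
```

Each recognized syllable would then be passed to the speech synthesis section 4, so that a
continued utterance keeps producing the same syllable window after window.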
[0049] This invention is not limited to the above-mentioned example in its field of
application. Besides application as utterance vicarious execution equipment for a
non-pharynx person, various applications can be considered: use together with an existing
speech recognition means that recognizes the sound signal, as an auxiliary means that
raises the rate of speech recognition; use under a high-noise environment where existing
speech recognition cannot be carried out; and, in an environment where quiet is required
and voice cannot be uttered, use for dictated document preparation by carrying out only
utterance operation without uttering voice.
[0050] 

[Effect of the Invention] As mentioned above, when the utterance equipment of this
invention is used, the user first makes the utterance equipment learn the relation between
the user's own myoelectric signal pattern and the syllable intended to be uttered, so
recognition adapted to the individual user's pattern can be performed. Compared with the
case of using recognition equipment without machine learning, there is the advantage that
the recognition rate is high.

[0051] Moreover, since the equipment is fitted to the individual user, the user does not
need to newly train utterance operation, and maintenance of natural utterance operation
can be expected.
[0052] Furthermore, since the skin surface myoelectric signals detected by two or more
skin surface electrodes are decomposed into frequency spectra and the per-band components
are processed in parallel in the recognition section, discrimination of consonants from
the myoelectric signal, which was conventionally difficult, is also fully possible.
[Brief Description of the Drawings]

[Drawing 1] It is the principle block diagram of this invention. 

[Drawing 2] It is drawing for explaining the outline of this invention. 

[Drawing 3] It is drawing for explaining the system of one example of this invention.

[Drawing 4] It is the rear view of the utterance equipment of one example of this invention.

[Drawing 5] It is the detailed block diagram of the utterance equipment of one example of
this invention.

[Drawing 6] It is the detailed block diagram of the recognition section of one example of
this invention, and the speech synthesis section.

[Drawing 7] It is the flow chart which shows instruction operation of one example of this
invention.

[Drawing 8] It is drawing for explaining recognition processing of one example of this
invention.

[Drawing 9] It is drawing showing the input to the recognition section of one example of
this invention.

[Drawing 10] It is the flow chart of recognition processing of utterance operation of one
example of this invention.

[Drawing 11] It is drawing showing the signal state at the time of recognition processing of
one example of this invention.
[Description of Notations] 

1 Utterance Equipment 

2 Keyboard 

3 Recognition Section 

4 Speech Synthesis Section 

5 Amplifier 

6 Loudspeaker
10 Signal-Detection Means

11 Syllable Discernment Means

12 Study Means

13 Recognition Means

14 Speech Synthesis Means

15 Voice Output Means
31-42 Skin surface electrode
60 Teacher Signal

70 Myoelectric Signal
81-86 Differential amplifier
91-96 Low cut filter
101-106 High frequency cut filter
111-116 FFT

301 Input Layer

302 Interlayer

303 Output Layer



